Smart Paper Notes

MDistMult: A Multiple Scoring Functions Model for Link Prediction on Antiviral Drugs Knowledge Graph

Weichuan Wang, Zhiwen Xie, Jin Liu, Yucong Duan, Bo Huang, Junsheng Zhang

Category: Artificial Intelligence

2021-11-29

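Knowledge graphs (KGs) on COVID-19 have been built to accelerate COVID-19 research. However, KGs are always incomplete, especially newly constructed COVID-19 KGs. The link prediction task aims to predict the missing entity in (e, r, t) or (h, r, e), where h and t are known entities, e is the entity to be predicted, and r is a relation. This task therefore has the potential to address the incompleteness of COVID-19-related KGs. Although various knowledge graph embedding (KGE) methods have been proposed for link prediction, existing methods are limited by their use of a single scoring function, which fails to capture the rich features of COVID-19 KGs. In this work, we propose the MDistMult model, which leverages multiple scoring functions to extract more features from existing triples. We conduct experiments on the CCKS2020 COVID-19 Antiviral Drugs Knowledge Graph (CADKG). The experimental results show that MDistMult achieves state-of-the-art performance on the link prediction task on the CADKG dataset.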
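
To make the link prediction setup concrete, the following is a minimal sketch of ranking candidate tail entities for a query (h, r, e) with the standard single DistMult scoring function, which the model name suggests as the starting point. The entity names, relation name, and embedding values are made up for illustration. How MDistMult combines multiple scoring functions is not described in the abstract and is not shown here.

```python
# Minimal DistMult link-prediction sketch (illustrative; not the MDistMult model itself).
# All names and embedding values below are hypothetical.

def distmult_score(h, r, t):
    """Trilinear DistMult score: sum_i h[i] * r[i] * t[i]."""
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))


def rank_tails(head, relation, entity_emb, relation_emb):
    """Rank every candidate entity e for the query (head, relation, e), best first."""
    h = entity_emb[head]
    r = relation_emb[relation]
    scored = [(distmult_score(h, r, e_vec), name) for name, e_vec in entity_emb.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]


# Hypothetical 3-dimensional embeddings.
entity_emb = {
    "drug_A": [1.0, 0.0, 1.0],
    "virus_X": [2.0, 0.5, 1.0],
    "protein_P": [-1.0, 1.0, 0.0],
}
relation_emb = {"inhibits": [1.0, 1.0, 1.0]}

ranking = rank_tails("drug_A", "inhibits", entity_emb, relation_emb)
print(ranking)
```

Because the DistMult score is symmetric in h and t, the same function can rank candidate heads for (e, r, t) by swapping the roles of the two entities.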

Translated by Google Translate

Improving Stack Overflow question title generation with copying enhanced CodeBERT model and bi-modal information

Fengji Zhang, Xiao Yu, Jacky Keung, Fuyang Li, Zhiwen Xie, Zhen Yang, Caoyuan Ma, Zhimin Zhang

Categories: Natural Language Processing | Artificial Intelligence

2021-09-27

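Context: Stack Overflow is very helpful for software developers seeking answers to programming problems. Previous studies have shown that a growing number of questions are of low quality and therefore receive less attention from potential answerers. Gao et al. proposed an LSTM-based model (i.e., BiLSTM-CC) to automatically generate question titles from code snippets in order to improve question quality. However, the code snippets in the question body alone cannot provide sufficient information for title generation, and LSTMs cannot capture long-range dependencies between tokens. Objective: This paper proposes CCBERT, a novel deep learning-based model that aims to improve question title generation by making full use of the bi-modal information in the entire question body. Methods: CCBERT follows the encoder-decoder paradigm and uses CodeBERT to encode the question body into hidden representations, a stacked Transformer decoder to generate predicted tokens, and an additional copy attention layer to refine the output distribution. Both the encoder and the decoder perform multi-head self-attention to better capture long-range dependencies. To verify the effectiveness of CCBERT, this paper builds a dataset of around 200,000 high-quality questions filtered from the data officially published by Stack Overflow. Results: CCBERT outperforms all baseline models on the dataset. Experiments on code-only and low-resource datasets show that CCBERT remains superior, with less performance degradation. Human evaluation also shows the excellent performance of CCBERT with respect to the readability and correlation criteria.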
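
The role of a copy layer in refining the output distribution can be sketched as follows. This assumes a pointer-generator-style mixture of a vocabulary distribution and a copy distribution over source tokens; the exact formulation of CCBERT's copy attention layer is not given in the abstract and may differ. The function name, the gate `p_gen`, and all numeric values are hypothetical.

```python
# Sketch of a copy mechanism that mixes generation and copy distributions.
# Assumption: a pointer-generator-style mixture; CCBERT's actual layer may differ.

def mix_copy_distribution(p_vocab, copy_attention, source_tokens, p_gen):
    """Return p_gen * p_vocab plus (1 - p_gen) * attention mass copied onto source tokens."""
    final = {tok: p_gen * p for tok, p in p_vocab.items()}
    for tok, weight in zip(source_tokens, copy_attention):
        # Attention on repeated source tokens accumulates on the same output token.
        final[tok] = final.get(tok, 0.0) + (1.0 - p_gen) * weight
    return final


# Hypothetical decoder outputs for one decoding step.
p_vocab = {"how": 0.5, "to": 0.3, "parse": 0.2}
source_tokens = ["json", "parse", "json"]  # tokens from the question body
copy_attention = [0.25, 0.5, 0.25]         # sums to 1 over source positions

final = mix_copy_distribution(p_vocab, copy_attention, source_tokens, p_gen=0.5)
print(final)
```

In this toy step, the token "json" is absent from the vocabulary distribution yet still receives probability mass through copying, which is the kind of behavior that helps carry rare identifiers from the question body into a title.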

ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders

Sanghyun Woo, Shoubhik Debnath, Ronghang Hu, Xinlei Chen, Zhuang Liu, In So Kweon, Saining Xie

Category: Computer Vision

2023-01-02

Driven by improved architectures and better representation learning frameworks, the field of visual recognition has enjoyed rapid modernization and performance boost in the early 2020s. For example, modern ConvNets, represented by ConvNeXt, have demonstrated strong performance in various scenarios. While these models were originally designed for supervised learning with ImageNet labels, they can also potentially benefit from self-supervised learning techniques such as masked autoencoders (MAE). However, we found that simply combining these two approaches leads to subpar performance. In this paper, we propose a fully convolutional masked autoencoder framework and a new Global Response Normalization (GRN) layer that can be added to the ConvNeXt architecture to enhance inter-channel feature competition. This co-design of self-supervised learning techniques and architectural improvement results in a new model family called ConvNeXt V2, which significantly improves the performance of pure ConvNets on various recognition benchmarks, including ImageNet classification, COCO detection, and ADE20K segmentation. We also provide pre-trained ConvNeXt V2 models of various sizes, ranging from an efficient 3.7M-parameter Atto model with 76.7% top-1 accuracy on ImageNet, to a 650M Huge model that achieves a state-of-the-art 88.9% accuracy using only public training data.
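
The idea of inter-channel feature competition in a Global Response Normalization layer can be illustrated with a small pure-Python sketch. The three steps used here (per-channel global L2 aggregation, divisive normalization by the mean across channels, and a learnable calibration with a residual connection) are assumptions about details the abstract does not spell out, and the flat list-of-positions layout is chosen only for readability.

```python
import math

# Sketch of a Global Response Normalization (GRN) style layer.
# Assumption: aggregation -> divisive normalization -> calibration with a residual.

def global_response_norm(x, gamma, beta, eps=1e-6):
    """x: list of spatial positions, each a list of C channel values."""
    channels = len(x[0])
    # 1) Global aggregation: L2 norm of each channel over all spatial positions.
    g = [math.sqrt(sum(pos[c] ** 2 for pos in x)) for c in range(channels)]
    # 2) Divisive normalization: each channel's norm relative to the mean norm.
    mean_g = sum(g) / channels
    n = [gc / (mean_g + eps) for gc in g]
    # 3) Calibration: learnable scale and shift, plus a residual connection.
    return [
        [gamma[c] * pos[c] * n[c] + beta[c] + pos[c] for c in range(channels)]
        for pos in x
    ]


x = [[3.0, 0.0], [4.0, 0.0]]  # two spatial positions, two channels
out = global_response_norm(x, gamma=[1.0, 1.0], beta=[0.0, 0.0])
identity = global_response_norm(x, gamma=[0.0, 0.0], beta=[0.0, 0.0])
print(out)
print(identity)
```

With `gamma` and `beta` set to zero the layer reduces to the identity, so under this formulation it can be added to an existing block without changing its initial behavior.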

A Sequential Quadratic Programming Method with High Probability Complexity Bounds for Nonlinear Equality Constrained Stochastic Optimization

Albert S. Berahas, Miaolan Xie, Baoyu Zhou

Category: Machine Learning (Statistics)

2023-01-01

A step-search sequential quadratic programming method is proposed for solving nonlinear equality constrained stochastic optimization problems. It is assumed that constraint function values and derivatives are available, but only stochastic approximations of the objective function and its associated derivatives can be computed via inexact probabilistic zeroth- and first-order oracles. Under reasonable assumptions, a high-probability bound on the iteration complexity of the algorithm to approximate first-order stationarity is derived. Numerical results on standard nonlinear optimization test problems illustrate the advantages and limitations of our proposed method.
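
For orientation, the problem class and the textbook form of an SQP step can be written as below. The notation is an assumption introduced here, not taken from the abstract: $\bar{g}_k$ denotes a stochastic estimate of $\nabla f(x_k)$, $c_k = c(x_k)$, $J_k$ is the constraint Jacobian at $x_k$, $H_k$ is a symmetric approximation of the Hessian of the Lagrangian, $d_k$ is the search direction, and $y_k$ is the multiplier estimate. The paper's step-search rule and oracle conditions are not reproduced.

```latex
\min_{x \in \mathbb{R}^n} f(x) \quad \text{s.t.} \quad c(x) = 0,
\qquad
\begin{bmatrix} H_k & J_k^{\top} \\ J_k & 0 \end{bmatrix}
\begin{bmatrix} d_k \\ y_k \end{bmatrix}
= -\begin{bmatrix} \bar{g}_k \\ c_k \end{bmatrix}
```

The second block row, $J_k d_k = -c_k$, uses exact constraint information, which matches the assumption that constraint values and derivatives are available while only the objective is accessed through stochastic oracles.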

Disjoint Masking with Joint Distillation for Efficient Masked Image Modeling

Xin Ma, Chang Liu, Chunyu Xie, Long Ye, Yafeng Deng, Xiangyang Ji

Category: Computer Vision

2022-12-31

Masked image modeling (MIM) has shown great promise for self-supervised learning (SSL), yet it has been criticized for learning inefficiency. We believe that insufficient utilization of training signals is responsible. To alleviate this issue, we introduce a conceptually simple yet learning-efficient MIM training scheme, termed Disjoint Masking with Joint Distillation (DMJD). For disjoint masking (DM), we sequentially sample multiple masked views per image in a mini-batch under a disjoint regulation, which raises the usage of tokens for reconstruction in each image while keeping the masking rate of each view. For joint distillation (JD), we adopt a dual-branch architecture to predict invisible (masked) and visible (unmasked) tokens, respectively, with superior learning targets. Rooted in orthogonal perspectives on improving training efficiency, DM and JD cooperatively accelerate training convergence without sacrificing the model's generalization ability. Concretely, DM can train ViT with half of the effective training epochs (3.7 times less time-consuming) to report competitive performance. With JD, our DMJD clearly improves the linear probing classification accuracy over ConvMAE by 5.8%. On fine-grained downstream tasks such as semantic segmentation and object detection, our DMJD also presents superior generalization compared with state-of-the-art SSL methods. The code and model will be made public at https://github.com/mx-mark/DMJD.
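
A simplified sketch of the disjoint masking idea follows: several masked views of one image are sampled so that no token is masked in more than one view, while each view keeps the same masking rate. It partitions a single random permutation instead of sampling views sequentially, and it assumes that the masking rate times the number of views does not exceed 1. Both are simplifications made here, and the function name and parameters are hypothetical.

```python
import random

# Simplified disjoint masking: pairwise-disjoint masked index sets per image.
# Assumption: mask_ratio * num_views <= 1, so fully disjoint masks exist.

def disjoint_masks(num_tokens, mask_ratio, num_views, seed=0):
    """Return `num_views` pairwise-disjoint sets of masked token indices."""
    per_view = round(mask_ratio * num_tokens)
    if per_view * num_views > num_tokens:
        raise ValueError("mask_ratio * num_views must not exceed 1")
    rng = random.Random(seed)
    order = list(range(num_tokens))
    rng.shuffle(order)
    # Consecutive slices of one permutation cannot overlap.
    return [set(order[v * per_view:(v + 1) * per_view]) for v in range(num_views)]


masks = disjoint_masks(num_tokens=16, mask_ratio=0.5, num_views=2)
covered = set().union(*masks)
print(masks)
print(len(covered))
```

With a masking rate of 0.5 and two views, every token is reconstructed exactly once across the views, which illustrates how disjointness raises token usage per image.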

Guided Hybrid Quantization for Object detection in Multimodal Remote Sensing Imagery via One-to-one Self-teaching

Jiaqing Zhang, Jie Lei, Weiying Xie, Yunsong Li, Xiuping Jia

Category: Computer Vision

2022-12-31

Considering the computation complexity, we propose a Guided Hybrid Quantization with One-to-one Self-Teaching (GHOST) framework. More concretely, we first design a structure called guided quantization self-distillation (GQSD), which realizes a lightweight model through the synergy of quantization and distillation. The training process of the quantization model is guided by its full-precision model, which saves time and cost because no huge pre-trained model has to be prepared in advance. Second, we put forward a hybrid quantization (HQ) module to obtain the optimal bit width automatically under a constrained condition in which a threshold for the distribution distance between the center and samples is applied in the weight value search space. Third, in order to improve information transformation, we propose a one-to-one self-teaching (OST) module to give the student network an ability of self-judgment. A switch control machine (SCM) builds a bridge between the student network and the teacher network at the same location to help the teacher reduce wrong guidance and impart vital knowledge to the student. This distillation method allows a model to learn from itself and gain substantial improvement without any additional supervision. Extensive experiments on a multimodal dataset (VEDAI) and single-modality datasets (DOTA, NWPU, and DIOR) show that object detection based on GHOST outperforms the existing detectors. The small parameter size (<9.7 MB) and Bit-Operations (BOPs) (<2158 G), compared with remote sensing-based, lightweight, or distillation-based algorithms, demonstrate its superiority in the lightweight design domain. Our code and model will be released at https://github.com/icey-zhang/GHOST.

Symbolic Visual Reinforcement Learning: A Scalable Framework with Object-Level Abstraction and Differentiable Expression Search

Wenqing Zheng, S P Sharan, Zhiwen Fan, Kevin Wang, Yihan Xi, Zhangyang Wang

Categories: Machine Learning | Artificial Intelligence

2022-12-30

Learning efficient and interpretable policies has been a challenging task in reinforcement learning (RL), particularly in the visual RL setting with complex scenes. While neural networks have achieved competitive performance, the resulting policies are often over-parameterized black boxes that are difficult to interpret and deploy efficiently. More recent symbolic RL frameworks have shown that high-level domain-specific programming logic can be designed to handle both policy learning and symbolic planning. However, these approaches rely on coded primitives with little feature learning, and when applied to high-dimensional visual scenes, they can suffer from scalability issues and perform poorly when images have complex object interactions. To address these challenges, we propose Differentiable Symbolic Expression Search (DiffSES), a novel symbolic learning approach that discovers discrete symbolic policies using partially differentiable optimization. By using object-level abstractions instead of raw pixel-level inputs, DiffSES is able to leverage the simplicity and scalability advantages of symbolic expressions, while also incorporating the strengths of neural networks for feature learning and optimization. Our experiments demonstrate that DiffSES is able to generate symbolic policies that are simpler and more scalable than state-of-the-art symbolic RL methods, with a reduced amount of symbolic prior knowledge.

DGFont++: Robust Deformable Generative Networks for Unsupervised Font Generation

Xinyuan Chen, Yangchen Xie, Li Sun, Yue Lu

Categories: Computer Vision | Artificial Intelligence

2022-12-30

Automatic font generation without human experts is a practical and significant problem, especially for some languages that consist of a large number of characters. Existing methods for font generation are often based on supervised learning. They require a large amount of paired data, which is labor-intensive and expensive to collect. In contrast, common unsupervised image-to-image translation methods are not applicable to font generation, as they often define style as the set of textures and colors. In this work, we propose a robust deformable generative network for unsupervised font generation (abbreviated as DGFont++). We introduce a feature deformation skip connection (FDSC) to learn local patterns and geometric transformations between fonts. The FDSC predicts pairs of displacement maps and employs the predicted maps to apply deformable convolution to the low-level content feature maps. The outputs of FDSC are fed into a mixer to generate final results. Moreover, we introduce contrastive self-supervised learning to learn a robust style representation for fonts by understanding the similarities and dissimilarities of fonts. To distinguish different styles, we train our model with a multi-task discriminator, which ensures that each style can be discriminated independently. In addition to adversarial loss, two reconstruction losses are adopted to constrain the domain-invariant characteristics between generated images and content images. Taking advantage of FDSC and the adopted loss functions, our model is able to maintain spatial information and generate high-quality character images in an unsupervised manner. Experiments demonstrate that our model is able to generate character images of higher quality than state-of-the-art methods.

NeRF-Gaze: A Head-Eye Redirection Parametric Model for Gaze Estimation

Pengwei Yin, Jiawu Dai, Jingjing Wang, Di Xie, Shiliang Pu

Category: Computer Vision

2022-12-30

Gaze estimation is the fundamental basis for many visual tasks. Yet, the high cost of acquiring gaze datasets with 3D annotations hinders the optimization and application of gaze estimation models. In this work, we propose a novel head-eye redirection parametric model based on Neural Radiance Fields, which allows dense gaze data generation with view consistency and accurate gaze direction. Moreover, our head-eye redirection parametric model can decouple the face and eyes for separate neural rendering, so the attributes of the face, identity, illumination, and eye gaze direction can be controlled separately. Thus, diverse 3D-aware gaze datasets can be obtained by manipulating the latent codes belonging to different face attributes in an unsupervised manner. Extensive experiments on several benchmarks demonstrate the effectiveness of our method in domain generalization and domain adaptation for gaze estimation tasks.

DRG-Net: Interactive Joint Learning of Multi-lesion Segmentation and Classification for Diabetic Retinopathy Grading

Hasan Md Tusfiqur, Duy M. H. Nguyen, Mai T. N. Truong, Triet A. Nguyen, Binh T. Nguyen, Michael Barz, Hans-Juergen Profitlich, Ngoc T. T. Than, Ngan Le, Pengtao Xie

Category: Computer Vision

2022-12-30

Diabetic Retinopathy (DR) is a leading cause of vision loss in the world, and early DR detection is necessary to prevent vision loss and support appropriate treatment. In this work, we leverage interactive machine learning and introduce a joint learning framework, termed DRG-Net, to effectively learn both disease grading and multi-lesion segmentation. Our DRG-Net consists of two modules: (i) DRG-AI-System to classify DR grades, localize lesion areas, and provide visual explanations; (ii) DRG-Expert-Interaction to receive feedback from expert users and improve the DRG-AI-System. To deal with sparse data, we utilize transfer learning mechanisms to extract invariant feature representations by using Wasserstein distance and adversarial learning-based entropy minimization. Besides, we propose a novel attention strategy at both low- and high-level features to automatically select the most significant lesion information and provide explainable properties. In terms of human interaction, we further develop DRG-Net as a tool that enables expert users to correct the system's predictions, which may then be used to update the system as a whole. Moreover, thanks to the attention mechanism and the loss function constraints between lesion features and classification features, our approach remains robust given a certain level of noise in user feedback. We have benchmarked DRG-Net on the two largest DR datasets, i.e., IDRID and FGADR, and compared it to various state-of-the-art deep learning networks. In addition to outperforming other SOTA approaches, DRG-Net is effectively updated using user feedback, even in a weakly-supervised manner.
